AITopics | green block

Collaborating Authors

green block

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Frictional Agent Alignment Framework: Slow Down and Don't Break Things

Nath, Abhijnan, Graff, Carine, Bachinin, Andrei, Krishnaswamy, Nikhil

arXiv.org Artificial IntelligenceMay-27-2025

AI support of collaborative interactions entails mediating potential misalignment between interlocutor beliefs. Common preference alignment methods like DPO excel in static settings, but struggle in dynamic collaborative tasks where the explicit signals of interlocutor beliefs are sparse and skewed. We propose the Frictional Agent Alignment Framework (FAAF), to generate precise, context-aware "friction" that prompts for deliberation and re-examination of existing evidence. FAAF's two-player objective decouples from data skew: a frictive-state policy identifies belief misalignments, while an intervention policy crafts collaborator-preferred responses. We derive an analytical solution to this objective, enabling training a single policy via a simple supervised loss. Experiments on three benchmarks show FAAF outperforms competitors in producing concise, interpretable friction and in OOD generalization. By aligning LLMs to act as adaptive "thought partners" -- not passive responders -- FAAF advances scalable, dynamic human-AI collaboration. Our code and data can be found at https://github.com/csu-signal/FAAF_ACL.

intervention, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.19428

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.46)
Personal > Interview (0.45)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Logic-of-Thought: Empowering Large Language Models with Logic Programs for Solving Puzzles in Natural Language

Li, Naiqi, Liu, Peiyuan, Liu, Zheng, Dai, Tao, Jiang, Yong, Xia, Shu-Tao

arXiv.org Artificial IntelligenceMay-23-2025

Solving puzzles in natural language poses a long-standing challenge in AI. While large language models (LLMs) have recently shown impressive capabilities in a variety of tasks, they continue to struggle with complex puzzles that demand precise reasoning and exhaustive search. In this paper, we propose Logic-of-Thought (Logot), a novel framework that bridges LLMs with logic programming to address this problem. Our method leverages LLMs to translate puzzle rules and states into answer set programs (ASPs), the solution of which are then accurately and efficiently inferred by an ASP interpreter. This hybrid approach combines the natural language understanding of LLMs with the precise reasoning capabilities of logic programs. We evaluate our method on various grid puzzles and dynamic puzzles involving actions, demonstrating near-perfect accuracy across all tasks. Our code and data are available at: https://github.com/naiqili/Logic-of-Thought.

large language model, machine learning, puzzle, (19 more...)

arXiv.org Artificial Intelligence

2505.16114

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

On the Limit of Language Models as Planning Formalizers

Huang, Cassie, Zhang, Li

arXiv.org Artificial IntelligenceDec-13-2024

Large Language Models have been shown to fail to create executable and verifiable plans in grounded environments. An emerging line of work shows success in using LLM as a formalizer to generate a formal representation (e.g., PDDL) of the planning domain, which can be deterministically solved to find a plan. We systematically evaluate this methodology while bridging some major gaps. While previous work only generates a partial PDDL representation given templated and thus unrealistic environment descriptions, we generate the complete representation given descriptions of various naturalness levels. Among an array of observations critical to improve LLMs' formal planning ability, we note that large enough models can effectively formalize descriptions as PDDL, outperforming those directly generating plans, while being robust to lexical perturbation. As the descriptions become more natural-sounding, we observe a decrease in performance and provide detailed error analysis.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2412.09879

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
North America > United States (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Open-World Task and Motion Planning via Vision-Language Model Inferred Constraints

Kumar, Nishanth, Ramos, Fabio, Fox, Dieter, Garrett, Caelan Reed

arXiv.org Artificial IntelligenceNov-12-2024

Foundation models trained on internet-scale data, such as Vision-Language Models (VLMs), excel at performing tasks involving common sense, such as visual question answering. Despite their impressive capabilities, these models cannot currently be directly applied to challenging robot manipulation problems that require complex and precise continuous reasoning. Task and Motion Planning (TAMP) systems can control high-dimensional continuous systems over long horizons through combining traditional primitive robot operations. However, these systems require detailed model of how the robot can impact its environment, preventing them from directly interpreting and addressing novel human objectives, for example, an arbitrary natural language goal. We propose deploying VLMs within TAMP systems by having them generate discrete and continuous language-parameterized constraints that enable TAMP to reason about open-world concepts. Specifically, we propose algorithms for VLM partial planning that constrain a TAMP system's discrete temporal search and VLM continuous constraints interpretation to augment the traditional manipulation constraints that TAMP systems seek to satisfy. We demonstrate our approach on two robot embodiments, including a real world robot, across several manipulation tasks, where the desired objectives are conveyed solely through language.

artificial intelligence, constraint, planning & scheduling, (14 more...)

arXiv.org Artificial Intelligence

2411.08253

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World

Wu, Guande, Zhao, Chen, Silva, Claudio, He, He

arXiv.org Artificial IntelligenceMar-30-2024

Language agents that interact with the world on their own have great potential for automating digital tasks. While large language model (LLM) agents have made progress in understanding and executing tasks such as textual games and webpage control, many real-world tasks also require collaboration with humans or other LLMs in equal roles, which involves intent understanding, task coordination, and communication. To test LLM's ability to collaborate, we design a blocks-world environment, where two agents, each having unique goals and skills, build a target structure together. To complete the goals, they can act in the world and communicate in natural language. Under this environment, we design increasingly challenging settings to evaluate different collaboration perspectives, from independent to more complex, dependent tasks. We further adopt chain-of-thought prompts that include intermediate reasoning steps to model the partner's state and identify and correct execution errors. Both human-machine and machine-machine experiments show that LLM agents have strong grounding capacities, and our approach significantly improves the evaluation metric.

agent, inventory, llm agent, (17 more...)

arXiv.org Artificial Intelligence

2404.00246

Country:

Africa > Rwanda > Kigali > Kigali (0.04)
Oceania > Australia > Victoria (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Add feedback

BBSEA: An Exploration of Brain-Body Synchronization for Embodied Agents

Yang, Sizhe, Luo, Qian, Pani, Anumpam, Yang, Yanchao

arXiv.org Artificial IntelligenceFeb-12-2024

Embodied agents capable of complex physical skills can improve productivity, elevate life quality, and reshape human-machine collaboration. We aim at autonomous training of embodied agents for various tasks involving mainly large foundation models. It is believed that these models could act as a brain for embodied agents; however, existing methods heavily rely on humans for task proposal and scene customization, limiting the learning autonomy, training efficiency, and generalization of the learned policies. In contrast, we introduce a brain-body synchronization ({\it BBSEA}) scheme to promote embodied learning in unknown environments without human involvement. The proposed combines the wisdom of foundation models (``brain'') with the physical capabilities of embodied agents (``body''). Specifically, it leverages the ``brain'' to propose learnable physical tasks and success metrics, enabling the ``body'' to automatically acquire various skills by continuously interacting with the scene. We carry out an exploration of the proposed autonomous learning scheme in a table-top setting, and we demonstrate that the proposed synchronization can generate diverse tasks and develop multi-task policies with promising adaptability to new tasks and configurations. We will release our data, code, and trained models to facilitate future studies in building autonomously learning agents with large foundation models in more complex scenarios. More visualizations are available at \href{https://bbsea-embodied-ai.github.io}{https://bbsea-embodied-ai.github.io}

brain-body synchronization, green block, red block, (13 more...)

arXiv.org Artificial Intelligence

2402.08212

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.63)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Doing Experiments and Revising Rules with Natural Language and Probabilistic Reasoning

Piriyakulkij, Top, Ellis, Kevin

arXiv.org Artificial IntelligenceFeb-8-2024

We build a computational model of how humans actively infer hidden rules by doing experiments. The basic principles behind the model is that, even if the rule is deterministic, the learner considers a broader space of fuzzy probabilistic rules, which it represents in natural language, and updates its hypotheses online after each experiment according to approximately Bayesian principles. In the same framework we also model experiment design according to information-theoretic criteria. We find that the combination of these three principles -- explicit hypotheses, probabilistic rules, and online updates -- can explain human performance on a Zendo-style task, and that removing any of these components leaves the model unable to account for the data.

experiment, hypothesis, inference, (16 more...)

arXiv.org Artificial Intelligence

2402.06025

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Statler: State-Maintaining Language Models for Embodied Reasoning

Yoneda, Takuma, Fang, Jiading, Li, Peng, Zhang, Huanyu, Jiang, Tianchong, Lin, Shengjie, Picker, Ben, Yunis, David, Mei, Hongyuan, Walter, Matthew R.

arXiv.org Artificial IntelligenceDec-4-2023

There has been a significant research interest in employing large language models to empower intelligent robots with complex reasoning. Existing work focuses on harnessing their abilities to reason about the histories of their actions and observations. In this paper, we explore a new dimension in which large language models may benefit robotics planning. In particular, we propose Statler, a framework in which large language models are prompted to maintain an estimate of the world state, which are often unobservable, and track its transition as new actions are taken. Our framework then conditions each action on the estimate of the current world state. Despite being conceptually simple, our Statler framework significantly outperforms strong competing methods (e.g., Code-as-Policies) on several robot planning tasks. Additionally, it has the potential advantage of scaling up to more challenging long-horizon planning tasks. We release our code at https://github.com/ripl/statler

cyan block, query, yellow block, (17 more...)

arXiv.org Artificial Intelligence

2306.1784

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Republic of Türkiye > Aksaray Province > Aksaray (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment

Guo, Yanjiang, Wang, Yen-Jen, Zha, Lihan, Jiang, Zheyuan, Chen, Jianyu

arXiv.org Artificial IntelligenceSep-30-2023

Large language models (LLMs) encode a vast amount of semantic knowledge and possess remarkable understanding and reasoning capabilities. Previous work has explored how to ground LLMs in robotic tasks to generate feasible and executable textual plans. However, low-level execution in the physical world may deviate from the high-level textual plan due to environmental perturbations or imperfect controller design. In this paper, we propose \textbf{DoReMi}, a novel language model grounding framework that enables immediate Detection and Recovery from Misalignments between plan and execution. Specifically, we leverage LLMs to play a dual role, aiding not only in high-level planning but also generating constraints that can indicate misalignment during execution. Then vision language models (VLMs) are utilized to detect constraint violations continuously. Our pipeline can monitor the low-level execution and enable timely recovery if certain plan-execution misalignment occurs. Experiments on various complex tasks including robot arms and humanoid robots demonstrate that our method can lead to higher task success rates and shorter task completion times. Videos of DoReMi are available at \url{https://sites.google.com/view/doremi-paper}.

constraint, green block, red block, (14 more...)

arXiv.org Artificial Intelligence

2307.00329

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback

TRAC: A Textual Benchmark for Reasoning about Actions and Change

He, Weinan, Huang, Canming, Xiao, Zhanhao, Liu, Yongmei

arXiv.org Artificial IntelligenceNov-25-2022

Reasoning about actions and change (RAC) is essential to understand and interact with the ever-changing environment. Previous AI research has shown the importance of fundamental and indispensable knowledge of actions, i.e., preconditions and effects. However, traditional methods rely on logical formalization which hinders practical applications. With recent transformer-based language models (LMs), reasoning over text is desirable and seemingly feasible, leading to the question of whether LMs can effectively and efficiently learn to solve RAC problems. We propose four essential RAC tasks as a comprehensive textual benchmark and generate problems in a way that minimizes the influence of other linguistic requirements (e.g., grounding) to focus on RAC. The resulting benchmark, TRAC, encompassing problems of various complexities, facilitates a more granular evaluation of LMs, precisely targeting the structural generalization ability much needed for RAC. Experiments with three high-performing transformers indicates that additional efforts are needed to tackle challenges raised by TRAC.

large language model, logic & formal reasoning, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2211.1393

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback